VIPS: Simple Directory-Less Broadcast-Less Cache Coherence Protocol
Abstract
Coherence in multicores introduces complexity and overhead (directory, state bits) in exchange for local caching, while being “invisible” to the memory consistency model. In this paper we show that a much simpler (directory-less, broadcast-less) multicore coherence scheme provides almost the same performance without the complexity and overhead of a directory protocol. Motivated by recent efforts to simplify coherence for disciplined parallelism, we propose a hardware approach that does not require any application guidance. The cornerstone of our approach is a run-time, application-transparent division of data into private and shared at page granularity. This allows us to implement a dynamic write policy (write-back for private data, write-through for shared data), simplifying the protocol to just two stable states. Self-invalidation of the shared data at synchronization points allows us to remove the directory (and invalidations) completely, requiring only a data-race-free (DRF) guarantee, at the write-through granularity, from software. Allowing multiple simultaneous writers and merging their writes relaxes the DRF guarantee to word granularity and optimizes traffic. This leads to our main result: a virtually costless coherence that uses the same simple protocol for both shared DRF data and private data (differing only in when data is written back to the last-level cache) while approaching the performance of a complex directory protocol (within 3%).
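The mechanism described in the abstract can be illustrated with a small behavioral model. The following is a minimal sketch under stated assumptions, not the authors' implementation: the class name `VipsSketch`, the 4096-byte page size, the first-touch classification rule, the immediate (rather than delayed) write-through, and the treatment of each address as a single word are all simplifications introduced here for illustration.

```python
PAGE_SIZE = 4096  # assumed page size in bytes (not stated in the abstract)


class VipsSketch:
    """Toy model: per-core L1 caches over one shared last-level cache (LLC).

    Each address stands for one word; cache lines, evictions and timing
    are deliberately left out of this sketch.
    """

    def __init__(self, n_cores):
        self.llc = {}                                # addr -> value
        self.l1 = [dict() for _ in range(n_cores)]   # addr -> (value, dirty)
        self.first_toucher = {}                      # page -> first core to access it
        self.shared_pages = set()                    # pages touched by more than one core

    def _is_shared(self, core, addr):
        """Classify the page of addr; assumed first-touch rule."""
        page = addr // PAGE_SIZE
        owner = self.first_toucher.setdefault(page, core)
        if owner != core and page not in self.shared_pages:
            # Private -> shared transition: the old owner's dirty words
            # must reach the LLC before anyone else reads the page.
            self.shared_pages.add(page)
            self._flush_page(owner, page)
        return page in self.shared_pages

    def _flush_page(self, core, page):
        for addr, (value, dirty) in list(self.l1[core].items()):
            if addr // PAGE_SIZE == page and dirty:
                self.llc[addr] = value
                self.l1[core][addr] = (value, False)

    def read(self, core, addr):
        self._is_shared(core, addr)
        if addr not in self.l1[core]:
            self.l1[core][addr] = (self.llc.get(addr, 0), False)
        return self.l1[core][addr][0]

    def write(self, core, addr, value):
        if self._is_shared(core, addr):
            # Shared data: write-through (immediate here, a simplification).
            self.llc[addr] = value
            self.l1[core][addr] = (value, False)
        else:
            # Private data: write-back, so the word only becomes dirty locally.
            self.l1[core][addr] = (value, True)

    def sync(self, core):
        """Synchronization point: self-invalidate all cached shared data."""
        for addr in list(self.l1[core]):
            if addr // PAGE_SIZE in self.shared_pages:
                del self.l1[core][addr]
```

In this model no core ever sends an invalidation to another core and no sharer list is kept. A reader may see a stale copy of shared data until its next `sync`, which is acceptable only if the program is data-race-free, as the abstract requires.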
Similar resources
Library Cache Coherence
Directory-based cache coherence is a popular mechanism for chip multiprocessors and multicores. The directory protocol, however, requires multicast for invalidation messages and the collection of acknowledgement messages, which can be expensive in terms of latency and network traffic. Furthermore, the size of the directory increases with the number of cores. We present Library Cache Coherence (...
A SPEED Cache Coherence Protocol for an Optical Multi-Access Interconnect Architecture
This paper presents a low overhead, high performance cache coherence protocol designed to exploit high bandwidth point-to-point and broadcast features of optics. SPEED integrates the virtues of snoopy-based schemes and directory-based schemes into one efficient protocol. Directory-assist is used exclusively for read traffic to eliminate unnecessary broadcasts while snoopy-assist is used exclusively...
A NoC-level Support for Broadcast-based Coherence Protocols
Chip Multiprocessor Systems (CMPs) rely on a cache coherency protocol to maintain memory access coherence between cached data and main memory. The Hammer coherency protocol is appealing as it eliminates most of the space overhead when compared to a directory protocol. However, it generates much more traffic, thus stressing the NoC and having worse performance in terms of power consumption. When...
Design and Performance of Directory Caches for Scalable Shared Memory Multiprocessors
Recent research shows that the occupancy of the coherence controllers is a major performance bottleneck for distributed cache coherent shared memory multiprocessors. A significant part of the occupancy is due to the latency of accessing the directory, which is usually kept in DRAM memory. Most coherence controller designs that use protocol processors for executing the coherence protocol handler...
Phase-Priority based Directory Coherence for Multicore Processor
As the number of cores in a single chip increases, a typical implementation of a coherence protocol adds significant hardware and complexity overhead. Besides, the performance of a CMP system depends on the data access latency, which is highly affected by the coherence protocol and the on-chip interconnect. In this paper, we propose PPB (Phase-Priority Based) cache coherence protocol, an optimization of mod...
Publication date: 2011